SPLAT Provides Programmers a Fast and Accurate Study of Memory Behavior without the Necessity of a Costly Memory Simulator. The Tool Is Suitable for Use as a Step in an Iterative Optimization
Authors
Abstract
Memory performance is becoming a major bottleneck in current microprocessors. A great deal of research has aimed at developing techniques for improving memory performance. Some of these techniques rely on hardware alone, but many require programmer or compiler support. Examples of the latter are software prefetching, blocking, and copying. To use these techniques effectively, the programmer must have some knowledge of the program’s behavior. For instance, prefetching is useful only if it is limited to instructions that frequently produce cache misses. Adding a prefetch instruction to every memory instruction could result in significant performance degradation. These techniques might also require quantification of the different types of cache misses (see sidebar, page 60). For instance, microprocessors can avoid compulsory misses through both hardware and software prefetching. Blocking, or tiling, is a method of avoiding capacity misses; copying and padding are techniques for reducing the effect of conflict misses. Many processors provide hints in their memory instructions that the compiler can use for optimizing memory performance. Examples of such hints are the PowerPC’s cache bypass facility and the hints incorporated by the IA-64 instruction set. Effective use of these hints requires information about the program’s locality behavior. The process of obtaining information about a program’s locality characteristics is data locality analysis. Traditionally, this analysis takes place either at compile time or at runtime. The former approach incurs low overhead but is relatively inaccurate because the compiler lacks some information. The runtime approach usually takes the form of a memory hierarchy simulation, which is quite accurate but very slow. In this article, we introduce SPLAT (Static and Profiled Data Locality Analysis Tool). The tool’s purpose is to provide a fast study of memory behavior without the necessity of a costly memory simulator. 
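The three miss categories named above (compulsory, capacity, conflict) can be made concrete with a small sketch. It is not part of SPLAT; it only illustrates the usual convention: a miss is compulsory if the block was never referenced before, capacity if a fully associative LRU cache of the same size would also miss, and conflict otherwise. The function name `classify_misses`, the direct-mapped organization, and the default sizes are assumptions chosen for illustration.

```python
from collections import OrderedDict

def classify_misses(addresses, num_lines=4, line_size=16):
    """Classify the misses of a direct-mapped cache (illustrative sketch).

    num_lines and line_size are hypothetical defaults, not values from the article.
    """
    seen = set()                 # every block referenced so far (infinite cache)
    lru = OrderedDict()          # fully associative LRU cache with num_lines lines
    direct = [None] * num_lines  # direct-mapped cache: one block per set
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}

    for addr in addresses:
        block = addr // line_size

        # Look up (and update) the fully associative LRU reference cache.
        fa_hit = block in lru
        if fa_hit:
            lru.move_to_end(block)
        else:
            lru[block] = True
            if len(lru) > num_lines:
                lru.popitem(last=False)  # evict the least recently used block

        # Look up (and update) the direct-mapped cache being studied.
        index = block % num_lines
        dm_hit = direct[index] == block
        direct[index] = block

        if dm_hit:
            counts["hit"] += 1
        elif block not in seen:
            counts["compulsory"] += 1   # first reference to this block
        elif not fa_hit:
            counts["capacity"] += 1     # even full associativity would miss
        else:
            counts["conflict"] += 1     # caused only by the mapping function
        seen.add(block)

    return counts
```

For example, `classify_misses([0, 64, 0, 64])` alternates between two blocks that map to the same set, so it reports two compulsory misses followed by two conflict misses; this is the pattern that copying or padding is meant to remove.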
SPLAT consists of a static locality analysis enhanced by simple profiling data. Its overhead is low because it performs most of the analysis at compile time, and because the required profiling support is just a basic-block-execution count. Many commercial compilers support this profiling option. Compared with simulation techniques, SPLAT’s estimation technique is highly accurate for numeric codes. The tool is useful not only for compilers but also for programmers. To tune a program, programmers should know its performance, the ...
Jesús Sánchez, Antonio González
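The abstract does not describe SPLAT's algorithm, so the following is only a hedged sketch of the general idea of weighting a compile-time estimate by basic-block execution counts. It assumes, hypothetically, that the static analysis yields a miss ratio for each memory reference in a basic block and that the profile yields one execution count per block; the function name `estimate_misses` and both input formats are invented for this illustration.

```python
def estimate_misses(static_miss_ratios, block_counts):
    """Combine static per-reference miss ratios with profiled block counts.

    static_miss_ratios: {block name: [estimated miss ratio of each memory reference]}
    block_counts:       {block name: number of times the block executed}
    Both formats are assumptions for this sketch, not SPLAT's actual interface.
    """
    per_block = {}
    for block, ratios in static_miss_ratios.items():
        executions = block_counts.get(block, 0)  # unprofiled blocks count as never run
        # Expected misses = executions * sum of per-reference miss ratios.
        per_block[block] = executions * sum(ratios)
    return per_block, sum(per_block.values())


# Hypothetical input: a loop body with two references and an init block with one.
static = {"loop_body": [0.25, 1.0], "init": [0.5]}
profile = {"loop_body": 1000, "init": 10}
per_block, total = estimate_misses(static, profile)
print(per_block, total)
```

Ranking blocks by their estimated misses is one way a programmer could decide where prefetching or blocking is worth trying before re-profiling.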
Similar resources
A NEW TWO STEP CLASS OF METHODS WITH MEMORY FOR SOLVING NONLINEAR EQUATIONS WITH HIGH EFFICIENCY INDEX
An attempt is made to extend a two-step method without memory to a method with memory. Then, a new two-step derivative-free class of methods without memory, requiring three function evaluations per step, is suggested by using a convenient weight function for solving nonlinear equations. Eventually, we obtain a new class of methods by employing a self-accelerating parameter calculated in each iterative...
Fast Reconstruction of SAR Images with Phase Error Using Sparse Representation
In the past years, a number of algorithms have been introduced for synthetic aperture radar (SAR) imaging. However, they all suffer from the same problem: the data size to process is considerably large. In recent years, compressive sensing and sparse representation of the signal in SAR have gained significant research interest. This method offers the advantage of reducing the sampling rate, bu...
Design of a Multiplier for Similar Base Numbers Without Converting Base Using a Data Oriented Memory
One of the challenges in hardware performance is designing a high-speed calculating unit. Higher calculation speed in a computer system shows up as higher performance. As a result, designing a high-speed calculating unit is of utmost importance. In this paper, we start the design with the knowledge that one multiplier is made of several adders and one divider is made of several su...
Chaotic Genetic Algorithm based on Explicit Memory with a new Strategy for Updating and Retrieval of Memory in Dynamic Environments
Many of the problems considered in optimization and learning assume that solutions exist in a dynamic environment. Hence, algorithms are required that adapt dynamically to the problem’s conditions and search new conditions. In most cases, using information from the past allows quick adaptation after a change. This is the idea underlying the use of memory in this field, which involves key design issue...
A new iterative with memory class for solving nonlinear equations
In this work we develop a new optimal class without memory for approximating a simple root of a nonlinear equation. This class includes three parameters. Therefore, we try to derive some methods with memory so that the convergence order increases as high as possible. Some numerical examples are also presented.
Publication date: 2000